Gesture Recognition


A Comparative Study of EMG- and IMU-based Gesture Recognition at the Wrist and Forearm

Baghernezhad, Soroush, Mohammadreza, Elaheh, da Fonseca, Vinicius Prado, Zou, Ting, Jiang, Xianta

arXiv.org Artificial Intelligence

Gestures are an integral part of our daily interactions with the environment. Hand gesture recognition (HGR) is the process of interpreting human intent through various input modalities, such as visual data (images and videos) and bio-signals. Bio-signals are widely used in HGR due to their ability to be captured non-invasively via sensors placed on the arm. Among these, surface electromyography (sEMG), which measures the electrical activity of muscles, is the most extensively studied modality. However, less-explored alternatives such as inertial measurement units (IMUs) can provide complementary information on subtle muscle movements, which makes them valuable for gesture recognition. In this study, we investigate the potential of using IMU signals from different muscle groups to capture user intent. Our results demonstrate that IMU signals contain sufficient information to serve as the sole input sensor for static gesture recognition. Moreover, we compare different muscle groups and check the quality of pattern recognition on individual muscle groups. We further found that tendon-induced micro-movement captured by IMUs is a major contributor to static gesture recognition. We believe that leveraging muscle micro-movement information can enhance the usability of prosthetic arms for amputees. This approach also offers new possibilities for hand gesture recognition in fields such as robotics, teleoperation, sign language interpretation, and beyond.
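To make the IMU-only claim concrete, here is a minimal sketch of static gesture recognition from a single IMU window: hand-crafted time-domain features plus a nearest-centroid classifier. The feature set (per-axis mean and RMS) and the classifier are illustrative assumptions for exposition, not the paper's actual pipeline.

```python
# Sketch: static gesture recognition from IMU windows alone.
# Features (per-axis mean, RMS) and nearest-centroid matching are
# illustrative assumptions, not the paper's method.
import math

def window_features(window):
    """window: list of (ax, ay, az) accelerometer samples.
    Returns per-axis mean and RMS, a common time-domain feature set."""
    n = len(window)
    feats = []
    for axis in range(3):
        vals = [s[axis] for s in window]
        mean = sum(vals) / n
        rms = math.sqrt(sum(v * v for v in vals) / n)
        feats.extend([mean, rms])
    return feats

def nearest_centroid(feats, centroids):
    """centroids: dict gesture_name -> feature vector of same length."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(centroids, key=lambda g: dist(feats, centroids[g]))

# Toy usage with two synthetic static gestures.
rest = [(0.0, 0.0, 1.0)] * 50   # hand at rest: gravity on z only
fist = [(0.2, 0.1, 0.9)] * 50   # tendon micro-movement shifts the axes
centroids = {
    "rest": window_features(rest),
    "fist": window_features(fist),
}
probe = [(0.19, 0.11, 0.91)] * 50
print(nearest_centroid(window_features(probe), centroids))  # → fist
```

The point of the toy example is only that steady micro-shifts in IMU axes are separable with very simple features, consistent with the paper's finding that IMUs can serve as the sole input for static gestures.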


ReactEMG: Stable, Low-Latency Intent Detection from sEMG via Masked Modeling

Wang, Runsheng, Zhu, Xinyue, Chen, Ava, Xu, Jingxi, Winterbottom, Lauren, Nilsen, Dawn M., Stein, Joel, Ciocarlie, Matei

arXiv.org Artificial Intelligence

Surface electromyography (sEMG) signals show promise for effective human-machine interfaces, particularly in rehabilitation and prosthetics. However, challenges remain in developing systems that respond quickly to user intent, produce stable flicker-free output suitable for device control, and work across different subjects without time-consuming calibration. In this work, we propose a framework for EMG-based intent detection that addresses these challenges. We cast intent detection as per-timestep segmentation of continuous sEMG streams, assigning labels as gestures unfold in real time. We introduce a masked modeling training strategy that aligns muscle activations with their corresponding user intents, enabling rapid onset detection and stable tracking of ongoing gestures. In evaluations against baseline methods, using metrics that capture accuracy, latency and stability for device control, our approach achieves state-of-the-art performance in zero-shot conditions. These results demonstrate its potential for wearable robotics and next-generation prosthetic systems. Our project website, video, code, and dataset are available at: https://reactemg.github.io/
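The masked-modeling idea above can be sketched as follows: hide random spans of an sEMG window and train the model to recover per-timestep targets at the hidden positions. The span length and mask ratio below are illustrative assumptions, not ReactEMG's published hyperparameters.

```python
# Sketch: span masking for masked-modeling pretraining over a
# continuous sEMG stream. Hyperparameters are illustrative.
import random

def mask_spans(seq, mask_ratio=0.3, span=5, mask_token=None, rng=None):
    """Return (masked_seq, mask) where mask[t] is True at hidden steps.
    Spans of `span` consecutive timesteps are hidden until roughly
    mask_ratio of the sequence is covered; a per-timestep loss would
    then be computed only at masked positions."""
    rng = rng or random.Random(0)
    n = len(seq)
    mask = [False] * n
    target = int(mask_ratio * n)
    while sum(mask) < target:
        start = rng.randrange(0, max(1, n - span))
        for t in range(start, min(n, start + span)):
            mask[t] = True
    masked = [mask_token if m else x for x, m in zip(seq, mask)]
    return masked, mask

seq = list(range(20))               # stand-in for one sEMG channel
masked, mask = mask_spans(seq)
```

Training to fill contiguous hidden spans, rather than isolated timesteps, is what encourages the model to track an ongoing gesture stably instead of flickering between labels.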


SASG-DA: Sparse-Aware Semantic-Guided Diffusion Augmentation For Myoelectric Gesture Recognition

Liu, Chen, Han, Can, Xu, Weishi, Wang, Yaqi, Qian, Dahong

arXiv.org Artificial Intelligence

Surface electromyography (sEMG)-based gesture recognition plays a critical role in human-machine interaction (HMI), particularly for rehabilitation and prosthetic control. However, sEMG-based systems often suffer from the scarcity of informative training data, leading to overfitting and poor generalization in deep learning models. Data augmentation offers a promising approach to increasing the size and diversity of training data, where faithfulness and diversity are two critical factors to effectiveness. However, promoting untargeted diversity can result in redundant samples with limited utility. To address these challenges, we propose a novel diffusion-based data augmentation approach, Sparse-Aware Semantic-Guided Diffusion Augmentation (SASG-DA). To enhance generation faithfulness, we introduce the Semantic Representation Guidance (SRG) mechanism by leveraging fine-grained, task-aware semantic representations as generation conditions. To enable flexible and diverse sample generation, we propose a Gaussian Modeling Semantic Sampling (GMSS) strategy, which models the semantic representation distribution and allows stochastic sampling to produce both faithful and diverse samples. To enhance targeted diversity, we further introduce a Sparse-Aware Semantic Sampling (SASS) strategy to explicitly explore underrepresented regions, improving distribution coverage and sample utility. Extensive experiments on benchmark sEMG datasets, Ninapro DB2, DB4, and DB7, demonstrate that SASG-DA significantly outperforms existing augmentation methods. Overall, our proposed data augmentation approach effectively mitigates overfitting and improves recognition performance and generalization by offering both faithful and diverse samples. Gesture recognition serves as a fundamental technology for advancing human-machine interaction.
Among various gesture recognition modalities, surface electromyography (sEMG)-based approaches have gained increasing attention due to their non-invasive nature, high temporal resolution, and ability to directly capture muscle activation signals associated with voluntary movement [1].
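The GMSS idea can be sketched as fitting a Gaussian over a class's semantic representations and drawing stochastic generation conditions from it. A diagonal (per-dimension) Gaussian is an assumption made here for brevity, and the temperature knob only loosely mirrors the sparse-aware push toward underrepresented regions; the paper's exact parameterization may differ.

```python
# Sketch: Gaussian Modeling Semantic Sampling (GMSS), simplified to a
# diagonal Gaussian per class. The `temperature` parameter is an
# illustrative stand-in for sparse-aware coverage widening.
import random
import statistics

def fit_gaussian(reps):
    """reps: list of equal-length semantic vectors for one gesture class.
    Returns per-dimension (means, stds)."""
    dims = list(zip(*reps))
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) for d in dims]
    return means, stds

def sample_condition(means, stds, temperature=1.0, rng=None):
    """Draw one stochastic generation condition for the diffusion model;
    temperature > 1 widens sampling away from the class mean."""
    rng = rng or random.Random(0)
    return [rng.gauss(m, s * temperature) for m, s in zip(means, stds)]

means, stds = fit_gaussian([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
cond = sample_condition(means, stds, temperature=1.5)
```

Sampling conditions from the fitted distribution, rather than reusing observed representations, is what lets the generator produce samples that are faithful to the class yet not duplicates of the training set.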


Achieving Effective Virtual Reality Interactions via Acoustic Gesture Recognition based on Large Language Models

Zhang, Xijie, He, Fengliang, Dai, Hong-Ning

arXiv.org Artificial Intelligence

Natural and efficient interaction remains a critical challenge for virtual reality and augmented reality (VR/AR) systems. Vision-based gesture recognition suffers from high computational cost, sensitivity to lighting conditions, and privacy leakage concerns. Acoustic sensing provides an attractive alternative: by emitting inaudible high-frequency signals and capturing their reflections, channel impulse response (CIR) encodes how gestures perturb the acoustic field in a low-cost and user-transparent manner. However, existing CIR-based gesture recognition methods often rely on extensive training of models on large labeled datasets, making them unsuitable for few-shot VR scenarios. In this work, we propose the first framework that leverages large language models (LLMs) for CIR-based gesture recognition in VR/AR systems. Despite LLMs' strengths, it is non-trivial to achieve few-shot and zero-shot learning of CIR gestures due to their inconspicuous features. To tackle this challenge, we collect differential CIR rather than original CIR data. Moreover, we construct a real-world dataset collected from 10 participants performing 15 gestures across three categories (digits, letters, and shapes), with 10 repetitions each. We then conduct extensive experiments on this dataset using an LLM-adopted classifier. Results show that our LLM-based framework achieves accuracy comparable to classical machine learning baselines, while requiring no domain-specific retraining.
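The differential-CIR step described above can be sketched as a frame-to-frame difference of CIR magnitude vectors, which cancels static reflections and keeps only gesture-induced changes. The exact differencing scheme here is an assumption for illustration, not the paper's definition.

```python
# Sketch: differential CIR. Static reflections (walls, torso) produce
# constant taps that cancel under differencing; a moving hand does not.
def differential_cir(frames):
    """frames: list of CIR magnitude vectors, one per time step.
    Returns len(frames) - 1 difference vectors."""
    return [
        [b - a for a, b in zip(prev, cur)]
        for prev, cur in zip(frames, frames[1:])
    ]

# A static scene yields all-zero differences; motion shows up directly.
static = differential_cir([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]])
moving = differential_cir([[1.0, 2.0], [1.0, 3.0], [2.0, 3.0]])
```

Making motion explicit in the input is what gives the LLM classifier conspicuous features to work with in the few-shot setting.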


Accurate online action and gesture recognition system using detectors and Deep SPD Siamese Networks

Akremi, Mohamed Sanim, Slama, Rim, Tabia, Hedi

arXiv.org Artificial Intelligence

Human activity recognition is an important research topic in pattern recognition field. It has been the subject of many studies in the past two decades because of its importance in numerous areas such as security, health, daily activity, energy consumption and robotics. Recently, some works on the recognition of hand gestures or human actions from skeletal data are based on the modeling of the skeleton's movement as manifold-based representation and proposed deep neural networks on this structure [1, 2, 3]. These approaches demonstrated their potential in the processing of skeletal data. Most of them are applied on offline human action recognition which is useful in time-limited tasks. However, in many applications, simply recognizing a single gesture in a given segmented sequence is not enough, especially in monitoring systems and virtual-reality devices which need to detect human movements moment by moment in continuous videos. In these online recognition systems, it is important to detect the existence of an action as early as possible after its beginning. It is also essential to determine the nature of the movement within a sequence of frames, without having information about the number of gestures present within the video, their starting times or their durations, unlike the segmented action recognition. In this paper, we propose to use a manifold-based model in order to build an online motion recognition system that detects and identifies different human activities in unsegmented skeletal sequences.
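A common concrete instance of the manifold-based representation mentioned above is the covariance matrix of joint coordinates over a window, which (with a small regularizer) is symmetric positive definite (SPD). The sketch below builds only that descriptor, under the assumption that the Siamese network then compares such matrices; it is not the paper's full model.

```python
# Sketch: an SPD skeleton descriptor, i.e. the regularized covariance
# of joint coordinates over a window of frames. Illustrative only.
def spd_descriptor(frames, eps=1e-6):
    """frames: list of flattened joint-coordinate vectors (one per frame).
    Returns the regularized covariance matrix as a list of rows."""
    n, d = len(frames), len(frames[0])
    means = [sum(f[i] for f in frames) / n for i in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in frames:
        c = [x - m for x, m in zip(f, means)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += c[i] * c[j] / n
    for i in range(d):
        cov[i][i] += eps  # regularize so the matrix is strictly SPD
    return cov

cov = spd_descriptor([[0.0, 0.0], [1.0, 1.0]])
```

Because covariance captures how joints co-vary rather than where they are, the descriptor is naturally robust to translation, which is one reason SPD representations suit skeleton data.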


Spiking Patches: Asynchronous, Sparse, and Efficient Tokens for Event Cameras

Øhrstrøm, Christoffer Koo, Güldenring, Ronja, Nalpantidis, Lazaros

arXiv.org Artificial Intelligence

We propose tokenization of events and present a tokenizer, Spiking Patches, specifically designed for event cameras. Given a stream of asynchronous and spatially sparse events, our goal is to discover an event representation that preserves these properties. Prior works have represented events as frames or as voxels. However, while these representations yield high accuracy, both frames and voxels are synchronous and decrease the spatial sparsity. Spiking Patches gives the means to preserve the unique properties of event cameras and we show in our experiments that this comes without sacrificing accuracy. We evaluate our tokenizer using a GNN, PCN, and a Transformer on gesture recognition and object detection. Tokens from Spiking Patches yield inference times that are up to 3.4x faster than voxel-based tokens and up to 10.4x faster than frames. We achieve this while matching their accuracy and even surpassing in some cases with absolute improvements up to 3.8 for gesture recognition and up to 1.4 for object detection. Thus, tokenization constitutes a novel direction in event-based vision and marks a step towards methods that preserve the properties of event cameras.
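The patch-based tokenization idea can be sketched as binning events by spatial patch and emitting a token only once a patch accumulates enough events, so the output stays asynchronous and sparse. The patch size and threshold are illustrative, and the accumulate-and-fire rule here is an assumption; Spiking Patches' actual spiking mechanism differs in detail.

```python
# Sketch: asynchronous patch tokenization of an event stream.
# A patch "fires" a token once it holds `threshold` events.
# Parameters are illustrative, not the paper's.
from collections import defaultdict

def tokenize_events(events, patch=8, threshold=4):
    """events: iterable of (t, x, y, polarity) tuples.
    Yields (t, patch_x, patch_y, [events]) tokens as patches fill up."""
    buffers = defaultdict(list)
    for ev in events:
        t, x, y, _ = ev
        key = (x // patch, y // patch)
        buffers[key].append(ev)
        if len(buffers[key]) >= threshold:
            yield (t, key[0], key[1], buffers.pop(key))

events = [(i, 3, 3, 1) for i in range(4)] + [(4, 20, 20, 1)]
tokens = list(tokenize_events(events))
```

Only the active patch emits a token; the lone event at (20, 20) stays buffered, which is the sparsity-preserving behavior the abstract contrasts with dense frames and voxels.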